AITopics | rate-distortion theory

Itisoften the responsibility of the agent designer toconstruct thistargetwhich,inrichandcomplexenvironments,constitutesaonerousburden; without full knowledge of the environment itself, a designer may forge a suboptimal learning target that poorly balances the amount ofinformation an agent must acquire to identify the target against the target's associated performance shortfall.

agent, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

The Value of Information When Deciding What to Learn

Neural Information Processing SystemsDec-24-2025, 03:02:31 GMT

All sequential decision-making agents explore so as to acquire knowledge about a particular target. It is often the responsibility of the agent designer to construct this target which, in rich and complex environments, constitutes a onerous burden; without full knowledge of the environment itself, a designer may forge a sub-optimal learning target that poorly balances the amount of information an agent must acquire to identify the target against the target's associated performance shortfall. While recent work has developed a connection between learning targets and rate-distortion theory to address this challenge and empower agents that decide what to learn in an automated fashion, the proposed algorithm does not optimally tackle the equally important challenge of efficient information acquisition. In this work, building upon the seminal design principle of information-directed sampling (Russo & Van Roy, 2014), we address this shortcoming directly to couple optimal information acquisition with the optimal design of learning targets. Along the way, we offer new insights into learning targets from the literature on rate-distortion theory before turning to empirical results that confirm the value of information when deciding what to learn.

information, name change, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.77)

Add feedback

Domain Generalization without Excess Empirical Risk

Neural Information Processing SystemsAug-14-2025, 23:13:35 GMT

We present an approach that eliminates this problem.

domain generalization, empirical risk, generalization, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Africa (0.04)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

The Value of Information When Deciding What to Learn

Neural Information Processing SystemsOct-10-2024, 08:51:57 GMT

All sequential decision-making agents explore so as to acquire knowledge about a particular target. It is often the responsibility of the agent designer to construct this target which, in rich and complex environments, constitutes a onerous burden; without full knowledge of the environment itself, a designer may forge a sub-optimal learning target that poorly balances the amount of information an agent must acquire to identify the target against the target's associated performance shortfall. While recent work has developed a connection between learning targets and rate-distortion theory to address this challenge and empower agents that decide what to learn in an automated fashion, the proposed algorithm does not optimally tackle the equally important challenge of efficient information acquisition. In this work, building upon the seminal design principle of information-directed sampling (Russo & Van Roy, 2014), we address this shortcoming directly to couple optimal information acquisition with the optimal design of learning targets. Along the way, we offer new insights into learning targets from the literature on rate-distortion theory before turning to empirical results that confirm the value of information when deciding what to learn.

information acquisition, knowledge, rate-distortion theory, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.82)

Add feedback

Exploring Action-Centric Representations Through the Lens of Rate-Distortion Theory

Varona, Miguel de Llanza, Buckley, Christopher L., Millidge, Beren

arXiv.org Artificial IntelligenceSep-13-2024

Organisms have to keep track of the information in the environment that is relevant for adaptive behaviour. Transmitting information in an economical and efficient way becomes crucial for limited-resourced agents living in high-dimensional environments. The efficient coding hypothesis claims that organisms seek to maximize the information about the sensory input in an efficient manner. Under Bayesian inference, this means that the role of the brain is to efficiently allocate resources in order to make predictions about the hidden states that cause sensory data. However, neither of those frameworks accounts for how that information is exploited downstream, leaving aside the action-oriented role of the perceptual system. Rate-distortion theory, which defines optimal lossy compression under constraints, has gained attention as a formal framework to explore goal-oriented efficient coding. In this work, we explore action-centric representations in the context of rate-distortion theory. We also provide a mathematical definition of abstractions and we argue that, as a summary of the relevant details, they can be used to fix the content of action-centric representations. We model action-centric representations using VAEs and we find that such representations i) are efficient lossy compressions of the data; ii) capture the task-dependent invariances necessary to achieve successful behaviour; and iii) are not in service of reconstructing the data. Thus, we conclude that full reconstruction of the data is rarely needed to achieve optimal behaviour, consistent with a teleological approach to perception.

information, reconstruction, representation, (13 more...)

arXiv.org Artificial Intelligence

2409.08892

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > East Sussex > Brighton (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)

Add feedback

Dataset Size Dependence of Rate-Distortion Curve and Threshold of Posterior Collapse in Linear VAE

Ichikawa, Yuma, Hukushima, Koji

arXiv.org Machine LearningSep-14-2023

In the Variational Autoencoder (VAE), the variational posterior often aligns closely with the prior, which is known as posterior collapse and hinders the quality of representation learning. To mitigate this problem, an adjustable hyperparameter beta has been introduced in the VAE. This paper presents a closed-form expression to assess the relationship between the beta in VAE, the dataset size, the posterior collapse, and the rate-distortion curve by analyzing a minimal VAE in a high-dimensional limit. These results clarify that a long plateau in the generalization error emerges with a relatively larger beta. As the beta increases, the length of the plateau extends and then becomes infinite beyond a certain beta threshold. This implies that the choice of beta, unlike the usual regularization parameters, can induce posterior collapse regardless of the dataset size. Thus, beta is a risky parameter that requires careful tuning. Furthermore, considering the dataset-size dependence on the rate-distortion curve, a relatively large dataset is required to obtain a rate-distortion curve with high rates. Extensive numerical experiments support our analysis.

artificial intelligence, machine learning, vae, (10 more...)

arXiv.org Machine Learning

2309.07663

Country: